Monitoring Membership Changes in a Fault-Tolerant Distributed System

Authors

  • M. R. King
  • L. E. Moser
  • P. M. Melliar-Smith
  • D. A. Agarwal
Abstract

The Totem protocol supports the maintenance of consistency of replicated information in fault-tolerant distributed systems by providing reliable totally ordered delivery of messages. The membership algorithm of Totem maintains a consistent view of processors in a local-area network and handles all aspects of reconfiguration, including restarting of failed processors and remerging of partitioned networks. In this paper we describe a network monitor with graphical output that we have constructed for the Totem protocol development environment. The development environment executes the protocol object modules unmodified while simulating network communication, timing, and fault injection. The network monitor collects information from the processors within the development environment and displays global membership information and other data in an intuitive fashion.

Similar resources

First Steps in the Implementation of a Fault-Tolerant

Transis [ADKM92, AAD93, ADM+93] is a tool for group communication that provides reliable ordered multicast along with membership services and strong group semantics. Transis can currently be used by processes residing on nodes within a BCD (Broadcast Domain). Building distributed applications on top of these services enables the programmer to assume ordering constraints on message delivery even ...

Toward a Solution to Partitionable Group Membership for MANETs

Ubiquitous computing environments are characterised by a diversity of mobile nodes and networks, and in particular, mobile ad-hoc networks. Mobile Ad-hoc NETworks (MANETs) are self-organising networks that lack a fixed infrastructure, and due to node arrivals, departures, crashes and movements, they are very dynamic networks. Topology changes occur both rapidly and unexpectedly and nodes (...

Node Failure Detection and Membership in CANELy

Fault-tolerant distributed systems based on fieldbuses may benefit to a great extent from the availability of semantically rich communication services, such as those provided by group communication, clock synchronization, membership and failure detection. This is especially true of distributed critical control applications. However, the migration of those services to the realm of simple fieldbus...

A Model for Adaptive Fault-Tolerant Systems

An adaptive computing system is one that modifies its behavior based on changes in the environment. Since one common type of environment change in a distributed system is network or processor failure, fault-tolerant distributed systems can be viewed as an important subclass of adaptive systems. As such, use of adaptive methods for dealing with failures in this context has the same potential adva...

Architecture and Protocols for Fault-Tolerant Distributed Objects

We present an architecture and protocols for distributed fault-tolerant objects. The architecture and protocols are based on a novel object-centric formulation of the reliable group multicast problem which we refer to as the Group State Synchronization Problem (GSSP). This formulation allows us to capture and represent fault-tolerance semantics at the object level. The GSSP formulation has thre...

Publication date: 1995